{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "27bc87b7",
   "metadata": {},
   "source": [
    "# Amazon Neptune Property Graph Store"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "78b60432",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install boto3 nest_asyncio\n",
    "%pip install llama-index-llms-bedrock\n",
    "%pip install llama-index-graph-stores-neptune\n",
    "%pip install llama-index-embeddings-bedrock"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "be3f7baa-1c0a-430b-981b-83ddca9e71f2",
   "metadata": {},
   "source": [
    "## Using Property Graph with Amazon Neptune"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "97221c15",
   "metadata": {},
   "source": [
    "### Add the required imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c79c7f2e",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.llms.bedrock import Bedrock\n",
    "from llama_index.embeddings.bedrock import BedrockEmbedding\n",
    "from llama_index.core import (\n",
    "    StorageContext,\n",
    "    SimpleDirectoryReader,\n",
    "    PropertyGraphIndex,\n",
    "    Settings,\n",
    ")\n",
    "from llama_index.graph_stores.neptune import (\n",
    "    NeptuneAnalyticsPropertyGraphStore,\n",
    "    NeptuneDatabasePropertyGraphStore,\n",
    ")\n",
    "from IPython.display import Markdown, display"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f553e01f",
   "metadata": {},
   "source": [
    "### Configure the LLM to use, in this case Amazon Bedrock and Claude 2.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "032264ce",
   "metadata": {},
   "outputs": [],
   "source": [
    "llm = Bedrock(model=\"anthropic.claude-3-sonnet-20240229-v1:0\")\n",
    "embed_model = BedrockEmbedding(model=\"amazon.titan-embed-text-v2:0\")\n",
    "\n",
    "Settings.llm = llm\n",
    "Settings.embed_model = embed_model\n",
    "Settings.chunk_size = 512"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "75f1d565-04e8-41bc-9165-166dc89b6b47",
   "metadata": {},
   "source": [
    "### Building the Graph\n",
    "\n",
    "### Read in the sample file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1c297fd3-3424-41d8-9d0d-25fe6310ab62",
   "metadata": {},
   "outputs": [],
   "source": [
    "import nest_asyncio\n",
    "\n",
    "# Add this so that it works with the event loop in Jupyter Notebooks\n",
    "nest_asyncio.apply()\n",
    "\n",
    "documents = SimpleDirectoryReader(\n",
    "    \"../../../../examples/paul_graham_essay/data\"\n",
    ").load_data()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f0edbc99",
   "metadata": {},
   "source": [
    "### Instantiate Neptune Property Graph Indexes\n",
    "\n",
    "When using Amazon Neptune you can choose to use either Neptune Database or Neptune Analytics.\n",
    "\n",
    "Neptune Database is a serverless graph database designed for optimal scalability and availability. It provides a solution for graph database workloads that need to scale to 100,000 queries per second, Multi-AZ high availability, and multi-Region deployments. You can use Neptune Database for social networking, fraud alerting, and Customer 360 applications.\n",
    "\n",
    "Neptune Analytics is an analytics database engine that can quickly analyze large amounts of graph data in memory to get insights and find trends. Neptune Analytics is a solution for quickly analyzing existing graph databases or graph datasets stored in a data lake. It uses popular graph analytic algorithms and low-latency analytic queries.\n",
    "\n",
    "\n",
    "#### Using Neptune Database\n",
    "If you can choose to use [Neptune Database](https://docs.aws.amazon.com/neptune/latest/userguide/feature-overview.html) to store your property graph index you can create the graph store as shown below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "31ca71c6",
   "metadata": {},
   "outputs": [],
   "source": [
    "graph_store = NeptuneDatabasePropertyGraphStore(\n",
    "    host=\"<GRAPH NAME>.<CLUSTER ID>.<REGION>.neptune.amazonaws.com\", port=8182\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "67418411",
   "metadata": {},
   "source": [
    "#### Neptune Analytics\n",
    "\n",
    "If you can choose to use [Neptune Analytics](https://docs.aws.amazon.com/neptune-analytics/latest/userguide/what-is-neptune-analytics.html) to store your property index you can create the graph store as shown below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b6b11a9d",
   "metadata": {},
   "outputs": [],
   "source": [
    "graph_store = NeptuneAnalyticsPropertyGraphStore(\n",
    "    graph_identifier=\"<INSERT GRAPH IDENIFIER>\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "370fd08f-56ff-4c24-b0c4-c93116a6d482",
   "metadata": {},
   "outputs": [],
   "source": [
    "storage_context = StorageContext.from_defaults(graph_store=graph_store)\n",
    "\n",
    "# NOTE: can take a while!\n",
    "index = PropertyGraphIndex.from_documents(\n",
    "    documents,\n",
    "    property_graph_store = graph_store,\n",
    "    storage_context=storage_context\n",
    ")\n",
    "\n",
    "# Loading from an existing graph\n",
    "    index = PropertyGraphIndex.from_existing(\n",
    "        property_graph_store=graph_store\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c39a0eeb-ef16-4982-8ba8-b37c2c5f4437",
   "metadata": {},
   "source": [
    "#### Querying the Property Graph\n",
    "\n",
    "First, we can query and send only the values to the LLM."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "670300d8-d0a8-4201-bbcd-4a74b199fcdd",
   "metadata": {},
   "outputs": [],
   "source": [
    "query_engine = index.as_query_engine(\n",
    "    include_text=True,\n",
    "    llm=llm,\n",
    ")\n",
    "\n",
    "response = query_engine.query(\"Tell me more about Interleaf\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "eecf2d57-3efa-4b0d-941a-95438d42893c",
   "metadata": {},
   "outputs": [],
   "source": [
    "display(Markdown(f\"<b>{response}</b>\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "70ff01f3",
   "metadata": {},
   "source": [
    "Second, we can use the query using a retriever"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ba48ad92",
   "metadata": {},
   "outputs": [],
   "source": [
    "retriever = index.as_retriever(\n",
    "    include_text=True,\n",
    ")\n",
    "\n",
    "nodes = retriever.retrieve(\"What happened at Interleaf and Viaweb?\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "84d44440",
   "metadata": {},
   "source": [
    "Third, we can use a `TextToCypherRetriever` to convert natural language questions into dynamic openCypher queries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bcdc59f0",
   "metadata": {},
   "outputs": [],
   "source": [
    "from llama_index.core.indices.property_graph import TextToCypherRetriever\n",
    "\n",
    "cypher_retriever = TextToCypherRetriever(index.property_graph_store)\n",
    "\n",
    "nodes = cypher_retriever.retrieve(\"What happened at Interleaf and Viaweb?\")\n",
    "print(nodes)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d0d5c8e3",
   "metadata": {},
   "source": [
    "Finally, we can use a `CypherTemplateRetriever` to provide a more constrained version of the `TextToCypherRetriever`. Rather than letting the LLM have free-range of generating any openCypher statement, we can instead provide a openCypher template and have the LLM fill in the blanks."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "06aba3cd",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pydantic import BaseModel, Field\n",
    "from llama_index.core.indices.property_graph import CypherTemplateRetriever\n",
    "\n",
    "cypher_query = \"\"\"\n",
    "    MATCH (c:Chunk)-[:MENTIONS]->(o)\n",
    "    WHERE o.name IN $names\n",
    "    RETURN c.text, o.name, o.label;\n",
    "   \"\"\"\n",
    "\n",
    "\n",
    "class TemplateParams(BaseModel):\n",
    "    \"\"\"Template params for a cypher query.\"\"\"\n",
    "\n",
    "    names: list[str] = Field(\n",
    "        description=\"A list of entity names or keywords to use for lookup in a knowledge graph.\"\n",
    "    )\n",
    "\n",
    "\n",
    "cypher_retriever = CypherTemplateRetriever(\n",
    "    index.property_graph_store, TemplateParams, cypher_query\n",
    ")\n",
    "nodes = cypher_retriever.retrieve(\"What happened at Interleaf and Viaweb?\")\n",
    "print(nodes)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
